Intl. Joint Conf. on Neural Networks IJCNN ’98, Anchorage

Author

  • Eduardo Sanchez
Abstract

Blackjack, or twenty-one, is a card game in which the player attempts to beat the dealer by obtaining a sum of card values that does not exceed 21 and is higher than the dealer's total. The probabilistic nature of the game makes it an interesting testbed problem for learning algorithms, though the problem of learning a good playing strategy is not obvious. Learning-with-a-teacher systems are not very useful, since the target outputs for a given stage of the game are not known. Instead, the learning system has to explore different actions and develop a strategy by selectively retaining the actions that maximize the player's performance. This paper explores the use of blackjack as a testbed for learning strategies in neural networks, specifically with reinforcement learning techniques. Performance comparisons with previous related approaches are also reported.

Keywords: Reinforcement learning, SARSA algorithm, Q-learning, Blackjack, learning strategies, artificial neural net-
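The abstract names SARSA and Q-learning but does not give their update rules. As a rough illustration only, the sketch below shows the standard textbook tabular forms of both updates. It is not the paper's method: the paper works with neural networks rather than a lookup table, and the action names, state labels, learning rate `alpha`, and discount `gamma` used here are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical action set for a simplified blackjack player.
ACTIONS = ("hit", "stand")


def q_learning_update(q, state, action, reward, next_state, done,
                      alpha=0.1, gamma=1.0):
    """Off-policy update: bootstrap from the best action in next_state."""
    if done:
        target = reward  # terminal step: no future value to add
    else:
        target = reward + gamma * max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (target - q[(state, action)])


def sarsa_update(q, state, action, reward, next_state, next_action, done,
                 alpha=0.1, gamma=1.0):
    """On-policy update: bootstrap from the action actually taken next."""
    if done:
        target = reward
    else:
        target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (target - q[(state, action)])


# Action values default to 0.0 for unseen (state, action) pairs.
q = defaultdict(float)
q[("s2", "stand")] = 1.0
q_learning_update(q, "s1", "hit", 0.0, "s2", done=False)
```

The only difference between the two rules is the bootstrap term: Q-learning uses the maximum value over the next state's actions, while SARSA uses the value of the action the player actually selects next, so exploratory moves influence what SARSA learns.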


Similar resources

Synchronized chaos in coupled neuromodules of different types

We discuss the time-discrete parametrized dynamics of two coupled recurrent neural networks. General conditions for the existence of synchronized dynamics are derived for these systems, and it is demonstrated that also the coupling of totally different network structures can result in periodic, quasiperiodic as well as chaotic dynamics constrained to a synchronization manifold M . Stability of ...


Speaker-independent phoneme recognition on TIMIT database using integrated time-delay neural networks (TDNNs)

School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, U.S.A. This paper describes a new structure of Neural Networks (NNs) for speaker-independent and context-independent phoneme recognition. This structure is based on the integration of Time-Delay Neural Networks (TDNN) which have several TDNNs separated according to the duration of phonemes. As a result, the proposed struc...


Biologically Inspired Neural Controllers for Motor Control in a Quadruped Robot

This paper presents biologically inspired neural controllers for generating motor patterns in a quadruped robot. Sets of artificial neural networks are presented which provide 1) pattern generation and gait control, allowing continuous passage from walking to trotting to galloping, 2) control of sitting and lying down behaviors, and 3) control of scratching. The neural controllers consist of set...


Inverse kinematics learning by modular architecture neural networks

Inverse kinematics computation using an artificial neural network that learns the inverse kinematics of a robot arm has been employed by many researchers. However, conventional learning methodologies do not pay enough attention to the discontinuity of the inverse kinematics system of typical robot arms with joint limits. The inverse kinematics system of the robot arms, including a human arm w...


Neural networks to estimate ML multi-class constrained conditional probability density functions

In this paper, a new algorithm, the Joint Network and Data Density Estimation (JNDDE), is proposed to estimate the 'a posteriori' probabilities of the targets with neural networks in multi-class problems. It is based on the estimation of conditional density functions for each class with some restrictions or constraints imposed by the classifier structure and the use of Bayes' rule to force the ...




Publication date: 1998